Living Decisions

Introduction

I'm about to graduate from The Fletcher School, and I want to make sure that if I return to Boston post-COVID, I find the right place to live. Enter this short, somewhat silly project. The purpose of this analysis is to determine where I should live, based on factors I have identified as particularly important to my own happiness, including impervious surface area, land cover, tree canopy, distance to farmers markets, libraries, and restaurants, and the age of the residents.

The Data

If you would like to investigate the data I used yourself, you'll find it in the data folder, which contains the following files:

Variables

Maps

Analysis Overview

Here is a rough outline of the analysis I will perform:

  1. Read in and visualize the raster datasets
  2. Find the zip codes we'll be working with
  3. Impervious surface area > reclassify for impervious surface preferences (more impervious = less desirable)
  4. Land cover > reclassify for land cover preferences
  5. Tree canopy > reclassify for tree canopy preferences
  6. Farmers markets > rasterize shapefile > get euclidean distance raster > reclassify (close = more desirable)
  7. Libraries > rasterize shapefile > get euclidean distance raster > reclassify (close = more desirable)
  8. Restaurants > pull restaurant data from OpenStreetMap > rasterize shapefile > get euclidean distance raster > reclassify (close = more desirable)
  9. Resident age > get the percent of residents aged 30-34 by zip code > join to zip code shapefile > rasterize shapefile > reclassify (higher percentage = more desirable)
  10. Calculate weighted and unweighted desirability raster from all of the reclassified rasters
  11. Mask desirability rasters to the shape of Boston
  12. Use zonal stats to determine the average desirability by zip code

Import Dependencies

Raster Checks

Before I get started with my variables, I want to check to see if all three of the rasters I'll be using have the same metadata, and load the metadata from one of them.
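Here is a minimal sketch of what that check can look like. It uses plain dictionaries as stand-ins for the metadata a raster library would report; the keys and the values (cell size, origin, nodata) are illustrative assumptions, not the real values from these files.

```python
def same_grid(metas, keys=("crs", "transform", "width", "height")):
    """Return True if every metadata dict agrees with the first on `keys`."""
    first = metas[0]
    return all(m.get(k) == first.get(k) for m in metas[1:] for k in keys)

# Hypothetical metadata for the three rasters (values invented for the sketch).
grid = {"crs": "EPSG:6491", "transform": (30.0, 0.0, 200000.0, 0.0, -30.0, 950000.0),
        "width": 4, "height": 3}
impervious_meta = dict(grid, nodata=127)
landcover_meta = dict(grid, nodata=0)
canopy_meta = dict(grid, nodata=255)

# nodata is allowed to differ; the grid definition is what has to line up.
rasters_align = same_grid([impervious_meta, landcover_meta, canopy_meta])
print(rasters_align)
```

Comparing only the grid-defining keys is a deliberate choice here: rasters with different nodata values can still be stacked cell by cell, but rasters with different transforms or shapes cannot.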

Great! Let's get started.

Our map of Boston

Let's see those ZIP code tabulation areas (ZCTAs). I'm also going to convert these to the proper CRS, which in this case is EPSG:6491 (Massachusetts Mainland), and select only the ZCTAs that are in the Boston area (MPO = Boston Region). To get us started, we'll bring in the outline of Massachusetts - it'll help us remove any overlap the ZCTAs have with the coastline.

Now we'll bring in the ZCTAs and clip them according to the official state outline.

Great. This will be more precise. Before we can get into just those ZCTAs that make up the Boston area, we have to extract just the Boston region from the MA Metropolitan Planning Organizations (MPOs).

Beautiful. Now I know that some ZCTA boundaries could cross town (and therefore MPO) borders, so I'm going to find the centroid of each ZCTA and keep only those ZCTAs whose centroids are within the Boston Region MPO.
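The centroid test can be sketched in plain Python. This is a simplified stand-in, not the geodataframe code itself: it assumes each shape is a single hole-free ring of projected (x, y) coordinates, uses the shoelace formula for the centroid, and uses ray casting for the point-in-polygon test. All coordinates and ZCTA labels below are made up.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross              # twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)

def point_in_polygon(pt, ring):
    """Ray casting: count edges crossed by a ray going right from pt."""
    x, y = pt
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y):    # edge spans the ray's y value
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical region and ZCTA outlines.
region = [(0, 0), (10, 0), (10, 10), (0, 10)]
zctas = {
    "A": [(1, 1), (4, 1), (4, 4), (1, 4)],      # fully inside
    "B": [(8, 2), (16, 2), (16, 6), (8, 6)],    # straddles, centroid outside
    "C": [(6, 6), (11, 6), (11, 9), (6, 9)],    # straddles, centroid inside
}
kept = [name for name, ring in zctas.items()
        if point_in_polygon(polygon_centroid(ring), region)]
print(kept)
```

The appeal of the centroid rule is that every ZCTA gets a clean yes or no, even when its boundary spills over the MPO border.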

Let's also create a mask from the Boston Region MPO while we're at it - we'll use it later.

Now we have a map to work with. On to the variables!

Impervious Surface Area

I loaded the landcover raster above, so let's actually check that metadata and then see what the matrix looks like.

Looks pretty solid. Now I want to get rid of all that pesky water on the coast.

The values in this raster tell us the percentage of the pixel area that's covered by impervious surfaces. I'm going to reclassify the Impervious Surface Raster to my own tastes - I want to see things that aren't asphalt and experience less flooding. I'd also like to vote for green infrastructure with my feet.

| Min Imperviousness (Inclusive) | Max Imperviousness (Exclusive) | Suitability Level | Suitability Score |
| --- | --- | --- | --- |
| ... | 10% | Very high suitability | 5 |
| 10% | 20% | High suitability | 4 |
| 20% | 60% | Medium suitability | 3 |
| 60% | 100% | Low suitability | 2 |
| 100% | ... | Very low suitability | 1 |
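One way to apply breakpoints like these to a raster array is `numpy.digitize`. This is a minimal sketch on a made-up array, assuming the values are percentages with nodata already removed; it is not necessarily how the reclassification was done in the original code.

```python
import numpy as np

# Upper bounds (exclusive) of the first four classes, in percent.
breaks = np.array([10, 20, 60, 100])

def score_impervious(pct):
    """Map percent imperviousness to a 5 (best) to 1 (worst) score."""
    bins = np.digitize(pct, breaks)   # 0 for <10, 1 for [10, 20), ... 4 for >=100
    return 5 - bins

sample = np.array([[0, 9.9, 10], [35, 60, 100]])
scores = score_impervious(sample)
print(scores)
```

Because `digitize` treats each break as the inclusive lower edge of the next class by default, a pixel at exactly 10% falls in the 10%-20% class, which matches the inclusive/exclusive columns.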

Land Cover

Alright, let's load in that land cover raster and get started!

I can see those classifications there already. I'm going to get rid of some of them and just clip the raster to the area I'm interested in. And let's see what that looks like, too.

Gorgeous. Now let's reclassify that based on what kind of area I want to live in. The table below shows the land cover types I'll be working with:

| Cell Value | Land Cover |
| --- | --- |
| 11 | Open Water |
| 12 | Perennial Ice/Snow |
| 21 | Developed, Open Space |
| 22 | Developed, Low Intensity |
| 23 | Developed, Medium Intensity |
| 24 | Developed, High Intensity |
| 31 | Barren Land (Rock/Sand/Clay) |
| 41 | Deciduous Forest |
| 42 | Evergreen Forest |
| 43 | Mixed Forest |
| 52 | Shrub/Scrub |
| 71 | Grassland/Herbaceous |
| 81 | Pasture/Hay |
| 82 | Cultivated Crops |
| 90 | Woody Wetlands |
| 95 | Emergent Herbaceous Wetlands |

And I'm going to assign suitability scores from one to five to these land cover types based on the values below (just my own personal preferences, really):

| Land Cover Types | Land Cover Codes | Suitability Level | Suitability Score |
| --- | --- | --- | --- |
| Developed (all) | 21, 22, 23, 24 | Very high suitability | 5 |
| Forest | 41, 42, 43 | High suitability | 4 |
| Grassland, shrub | 52, 71 | Medium suitability | 3 |
| Barren land, cropland | 31, 81, 82 | Low suitability | 2 |
| Wetlands, water, ice/snow | 11, 12, 90, 95 | Very low suitability | 1 |
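Since land cover is categorical, a lookup table is a natural fit rather than breakpoints. Here is a small sketch of that idea on a made-up array; using 0 for any code without a score is my assumption for the example.

```python
import numpy as np

# Land cover code -> suitability score, taken from the table above.
SCORES = {21: 5, 22: 5, 23: 5, 24: 5,
          41: 4, 42: 4, 43: 4,
          52: 3, 71: 3,
          31: 2, 81: 2, 82: 2,
          11: 1, 12: 1, 90: 1, 95: 1}

# One slot per possible uint8 code; unlisted codes stay 0 (assumed "no score").
lookup = np.zeros(256, dtype=np.uint8)
for code, score in SCORES.items():
    lookup[code] = score

landcover = np.array([[11, 21, 24],
                      [41, 52, 82],
                      [90, 95, 0]], dtype=np.uint8)
landcover_scores = lookup[landcover]   # index the table with the raster itself
print(landcover_scores)
```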

Just about the opposite of the impervious surface raster - but that's okay! Let's keep going. This is great.

Tree Canopy

Time for trees! Let's load this raster in and see what the trees look like.

Looks like 255 is the null value. I'm going to get rid of those and the non-Boston trees, and just clip the raster to the area I'm interested in. And let's see what that looks like, too.

Lots of good looking trees in there. Now let's reclassify that based on the percentage of trees I want to live under (hint: more is better):

| Min Tree Cover (Inclusive) | Max Tree Cover (Exclusive) | Suitability Level | Suitability Score |
| --- | --- | --- | --- |
| 90% | ... | Very high suitability | 5 |
| 75% | 90% | High suitability | 4 |
| 50% | 75% | Medium suitability | 3 |
| 25% | 50% | Low suitability | 2 |
| ... | 25% | Very low suitability | 1 |
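The one wrinkle with this raster is the 255 null value, which would otherwise land in the "90% or more" class. A minimal sketch of scoring while keeping nulls out, on a made-up array; writing 0 for null cells is an assumption for the example.

```python
import numpy as np

NODATA = 255                          # null value seen in the canopy raster
breaks = np.array([25, 50, 75, 90])   # inclusive lower edges of scores 2..5

def score_canopy(pct):
    """Score percent tree cover from 1 to 5, with nodata cells set to 0."""
    scores = (np.digitize(pct, breaks) + 1).astype(np.uint8)
    scores[pct == NODATA] = 0         # do not let 255 pass as "lots of trees"
    return scores

canopy = np.array([[0, 24, 25],
                   [50, 75, 89],
                   [90, 100, 255]], dtype=np.uint8)
canopy_scores = score_canopy(canopy)
print(canopy_scores)
```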

It's starting to look like I don't want to live in Somerville.... Ah well. Onward!

Farmers Markets

Time to rasterize some farmers markets! This will be interesting - I want to create a distance raster for these.

What do these look like, where are they on a map?

Nice! Lots of concentration toward Boston and Cambridge. Now to turn them into rasters!

Now to create the distance raster - we'll plot this one too, because it'll actually look like something.
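For intuition, a Euclidean distance raster just stores, for every cell, the straight-line distance to the nearest "source" cell. The brute-force sketch below shows the idea with NumPy on a tiny made-up grid; the 30 m cell size and the market locations are assumptions, and on a full-size raster a dedicated routine such as `scipy.ndimage.distance_transform_edt` would be a far more memory-friendly choice.

```python
import numpy as np

def distance_raster(sources, cell_size):
    """Distance in map units from each cell centre to the nearest True cell."""
    rows, cols = np.indices(sources.shape)
    src_r, src_c = np.nonzero(sources)
    # Compare every cell with every source cell: shape (height, width, n_sources).
    dr = rows[..., None] - src_r
    dc = cols[..., None] - src_c
    return np.sqrt(dr**2 + dc**2).min(axis=-1) * cell_size

# Hypothetical rasterized markets: True where a market falls in the cell.
markets = np.zeros((4, 5), dtype=bool)
markets[0, 0] = True
markets[3, 4] = True

dist = distance_raster(markets, cell_size=30.0)
print(dist.round(1))
```

Multiplying by the cell size at the end is what turns cell counts into meters, which is what the distance breakpoints need.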

So many farmers markets! I had no idea. Let's reclassify this based on how far I really want to walk for veggies.

| Min Distance (Inclusive) | Max Distance (Exclusive) | Suitability Level | Suitability Score |
| --- | --- | --- | --- |
| ... | 0.5 miles (805 m) | Very high suitability | 5 |
| 0.5 miles (805 m) | 1 mile (1609 m) | High suitability | 4 |
| 1 mile (1609 m) | 2 miles (3218 m) | Medium suitability | 3 |
| 2 miles (3218 m) | 5 miles (8046 m) | Low suitability | 2 |
| 5 miles (8046 m) | ... | Very low suitability | 1 |

Definitely the weirdest looking map so far. To the library!

Libraries

Time to rasterize all these wonderful libraries! We want a distance raster here too.

Where are these libraries? Let's see them on a map.

Gorgeous. Now to create the distance raster - we'll plot this one too, because it'll actually look like something.

Even more libraries! I had no idea. Let's reclassify this based on how far I really want to walk for books. I'm using the same ratings as the ones for farmers markets.

| Min Distance (Inclusive) | Max Distance (Exclusive) | Suitability Level | Suitability Score |
| --- | --- | --- | --- |
| ... | 0.5 miles (805 m) | Very high suitability | 5 |
| 0.5 miles (805 m) | 1 mile (1609 m) | High suitability | 4 |
| 1 mile (1609 m) | 2 miles (3218 m) | Medium suitability | 3 |
| 2 miles (3218 m) | 5 miles (8046 m) | Low suitability | 2 |
| 5 miles (8046 m) | ... | Very low suitability | 1 |

Another very funny map. Those polka dots definitely want me to live in Somerville, even if tree cover doesn't!

Restaurants

Time to get OpenStreetMap in here! Let's find those restaurants. I'm going to bring in all of the restaurants in the Boston Region MPO.

That's so many restaurants. Let's see some of the relevant columns to see if this looks right.

Hey, at least some of them have addresses. I'd like to fix the CRS on this layer and then see what they look like.

Excellent. Time to rasterize those restaurants!

Now to create the distance raster - we'll plot this one too, because it'll actually look like something.

Heck yeah! Food everywhere. Let's reclassify this based on how far I really want to walk. I'm using the same ratings as the ones for farmers markets and books, because ultimately if I can't get there in 30 minutes, I can't get there at all.

| Min Distance (Inclusive) | Max Distance (Exclusive) | Suitability Level | Suitability Score |
| --- | --- | --- | --- |
| ... | 0.5 miles (805 m) | Very high suitability | 5 |
| 0.5 miles (805 m) | 1 mile (1609 m) | High suitability | 4 |
| 1 mile (1609 m) | 2 miles (3218 m) | Medium suitability | 3 |
| 2 miles (3218 m) | 5 miles (8046 m) | Low suitability | 2 |
| 5 miles (8046 m) | ... | Very low suitability | 1 |

That's a lot of blue in the very city-ish areas in the middle, and my stomach's a pretty big factor here. I might end up in more developed ZCTAs than I expected.

Resident Age

This one might be a little strange, using tabular data to create a raster. But we'll see what we get - I probably won't weight this very heavily anyway.

Now we have the age data! But there are way too many variables here, and we just want the important ones. Let's delete all the extra stuff.

I know the S0101_C02_008E column is the percent of residents aged 30-34, but I also know it was read in as an object - so let's convert that.

Now let's make that NAME column into something that'll actually work with the ZCTA geodataframe.

Excellent. And now I'll make a similar numeric column in the ZCTA file - just to be absolutely certain they'll talk to each other.

Perfect. Time to join this to the ZCTAs! Then we can rasterize this thing.
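The convert, key, and join steps can be sketched in plain Python, without the dataframe machinery. Everything in the sample rows is invented for illustration: I'm assuming the NAME values look like "ZCTA5 02130", that missing estimates arrive as non-numeric text, and that the ZCTA layer has a text code column (called `ZCTA5CE10` here purely as a placeholder).

```python
def to_percent(value):
    """Convert a text cell to float; non-numeric placeholders become None."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

def zcta_key(name):
    """'ZCTA5 02130' -> 2130: take the last token and make it an integer."""
    return int(name.split()[-1])

# Made-up rows standing in for the census table (not real percentages).
age_rows = [
    {"NAME": "ZCTA5 02130", "S0101_C02_008E": "11.2"},
    {"NAME": "ZCTA5 02113", "S0101_C02_008E": "14.0"},
    {"NAME": "ZCTA5 02047", "S0101_C02_008E": "-"},
]
# Made-up rows standing in for the ZCTA layer; the column name is hypothetical.
zcta_rows = [{"ZCTA5CE10": "02113"}, {"ZCTA5CE10": "02130"}, {"ZCTA5CE10": "02047"}]

pct_by_key = {zcta_key(r["NAME"]): to_percent(r["S0101_C02_008E"]) for r in age_rows}
joined = [dict(z, pct_30_34=pct_by_key.get(int(z["ZCTA5CE10"]))) for z in zcta_rows]
print(joined)
```

Joining on an integer key sidesteps mismatches between "02130" and "2130", at the cost of dropping the leading zero from the key itself.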

Dang. Very, very low percentages of people in their early 30s! Now it's time to rasterize it and see something really cool.

We have an age raster! Let's reclassify this to keep it in line with all the other variables. Based on the map above, I'm dealing with pretty low percentages of early-30s residents, so I'll adjust my expectations slightly.

| Min Percentage 30-34 (Inclusive) | Max Percentage 30-34 (Exclusive) | Suitability Level | Suitability Score |
| --- | --- | --- | --- |
| 10% | ... | Very high suitability | 5 |
| 7% | 10% | High suitability | 4 |
| 4% | 7% | Medium suitability | 3 |
| 1% | 4% | Low suitability | 2 |
| ... | 1% | Very low suitability | 1 |

Definitely the least pleasant of all the maps above - but still, there are a few dark blue ZCTAs!

Let's put everything together now.

Final Calculations

Well, we've made it this far looking at one variable at a time. But what do they all look like together? Let's put these scores together and see what happens!

Unweighted ZCTA suitability

Looks like all of those market, restaurant, and library dots were worth a lot! This is pretty neat - but how about weighted? I'll use the weights below. As you can see, my stomach drives all of my decisions.

| Component | Weight |
| --- | --- |
| Impervious surface | 10% |
| Land cover | 10% |
| Tree cover | 10% |
| Distance to farmers markets | 20% |
| Distance to libraries | 15% |
| Distance to restaurants | 20% |
| Percent of residents in early 30s | 15% |
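To make the two combinations concrete, here is a sketch on a tiny made-up set of score layers. It assumes the unweighted version is a plain sum of the seven 1-5 layers (so it ranges from 7 to 35) and the weighted version is a weighted sum that stays on the 1-5 scale; the layer names are placeholders.

```python
import numpy as np

# Weights from the table above, as fractions.
WEIGHTS = {"impervious": 0.10, "land_cover": 0.10, "tree_cover": 0.10,
           "markets": 0.20, "libraries": 0.15, "restaurants": 0.20,
           "age_30_34": 0.15}
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9   # weights should total 100%

# Invented 1x2 score rasters: a leafy suburban cell and a dense urban cell.
layers = {
    "impervious":  np.array([[5, 1]]),
    "land_cover":  np.array([[1, 5]]),
    "tree_cover":  np.array([[5, 1]]),
    "markets":     np.array([[2, 5]]),
    "libraries":   np.array([[3, 5]]),
    "restaurants": np.array([[1, 5]]),
    "age_30_34":   np.array([[2, 4]]),
}

unweighted = sum(layers.values())
weighted = sum(WEIGHTS[name] * layer for name, layer in layers.items())
print(unweighted, weighted)
```

In this toy example the urban cell already wins unweighted, and the weights widen the gap because the three distance layers carry 55% of the total.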

Yep, that city area just keeps looking better and better. I can always ride my bike to the Fells.

But what if we look at zonal stats and see exactly which ZCTA is the most suitable?

I need to start by resetting the index on my ZCTA gdf, to make sure the mean values can map to it.

Looks good. Now let's create those zonal stats for the unweighted version and map it!
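Under the hood, a zonal mean is just "average the raster cells that fall in each zone." The sketch below does that with NumPy on made-up arrays, assuming the ZCTAs have been rasterized to integer zone ids with 0 meaning outside every zone; a zonal statistics library does the same job directly from polygons.

```python
import numpy as np

def zonal_mean(values, zones):
    """Mean of `values` per positive zone id, ignoring NaN cells."""
    valid = (zones > 0) & ~np.isnan(values)
    ids = zones[valid]
    sums = np.bincount(ids, weights=values[valid])
    counts = np.bincount(ids)
    return {int(z): sums[z] / counts[z] for z in np.unique(ids)}

# Invented suitability raster and zone raster (ids 1 and 2, 0 = outside).
suitability = np.array([[20.0, 22.0, np.nan],
                        [24.0, 26.0, 30.0]])
zones = np.array([[1, 1, 0],
                  [1, 2, 2]])

means = zonal_mean(suitability, zones)
best = max(means, key=means.get)
worst = min(means, key=means.get)
print(means, best, worst)
```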

In this unweighted universe, what's the worst-ranked ZCTA for me to live in?

02047 it is! Don't think I'll go hang out there. What about the best?

02130 is in first place with a mean suitability score of 23.54. Pretty darn cool. That's Jamaica Plain!

But what about the weighted version?

Yep, those internal zip codes are calling to me. Out of all that blue, what's the best ZCTA for me?

02113! That's quite a coup. What about the worst? I see that dark red spot out in the northwest...

Interesting - weighting changed nothing at the bottom of the ranking! It's 02047 again.

It turns out that the most suitable place for me to live in the Boston region is the North End, and the worst is Humarock. I had never even heard of it - apparently only 180 people live there. The North End it is!

Well, if you need me, I'll be in the North End! Thanks for reading.